Gad! I clobbered your previous mail by trying to be too
clever, to the point where recovery is impossible. You had one note
from BPM saying that there were no more paper versions of the
document, but that microfiche versions would soon be around, and
could be gotten from Patte. Fortunately there was nothing else.
The interconnection problem is solved better than it was ever
solved before! The solution is actually in the 1968 SJCC Batcher
paper. His merge sorting net is more expensive than the permutation
net in the Waksman paper, but ever so much easier to use. The answer
to how to make a communications net where the requests are not
necessarily a perfect permutation is towards the end, in the
paragraph labelled, appropriately enough, "Switching network with
conflict resolution". For n processors, his method involves a sorting
network for n items connected to a merging network which merges the
sorted requests with n additional dummy requests, exactly one for
each processor. The real messages and the dummy requests are shipped
out through this, leaving a trace of their path. They come out of the
merger with all the messages to a given processor clumped together,
in order of a priority number (which constitutes the low order
address bits, as far as the switches are concerned), with the dummy
request for each processor heading its clump (because the dummy
requests are given the highest possible priority number). A circuit
now looks at adjacent pairs of messages in this list and exchanges
each dummy request with the message immediately following it, if that
message is for the same address as the dummy (if such a message
exists at all, it is the highest priority real request for that
destination). The net
is then run backwards, with messages taking the same path backwards
as they took forwards. The successful transfers come out where the
dummy requests went in, the machines which issued the successful
transfers getting the dummy (thus being informed of their success),
whereas unsuccessful requests are simply returned to sender. If
nobody wanted to talk to a particular processor, it is consoled by
receiving a dummy message.
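
To pin down the bookkeeping, here is a rough sketch in Python of just
the conflict-resolution step, with the sorter and the merger stood in
for by an ordinary sort; the names Request and resolve_conflicts are
mine, not Batcher's, and the backward trip through the frozen net is
only suggested by the winner and loser lists.

from dataclasses import dataclass

@dataclass
class Request:
    source: int        # processor that issued the request
    dest: int          # processor it wants to reach
    priority: int      # the low order "address" bits seen by the switches
    dummy: bool = False

def resolve_conflicts(requests, n_processors):
    """Return (winners, losers): at most one real request per destination wins."""
    # One dummy request per processor, given the highest possible priority
    # (modelled here by priority -1, real priorities assumed non-negative).
    dummies = [Request(source=p, dest=p, priority=-1, dummy=True)
               for p in range(n_processors)]

    # The sorter plus merger leave each destination's requests clumped
    # together, dummy first, then the real requests in priority order.
    merged = sorted(requests + dummies, key=lambda r: (r.dest, r.priority))

    winners, losers = [], []
    for i, r in enumerate(merged):
        if r.dummy:
            continue
        prev = merged[i - 1] if i > 0 else None
        # The request immediately following a dummy for the same destination
        # is the highest priority real request for that destination: it wins.
        if prev is not None and prev.dummy and prev.dest == r.dest:
            winners.append(r)
        else:
            losers.append(r)     # returned to sender on the backward pass
    return winners, losers
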
Batcher's scheme is quite nice, but it has the problem that the net can't
be pipelined, since the net has to freeze in a given state until the
messages filter back. This can be gotten around by sticking a few
extra bits into each message, recording the routing decisions as it
filters through the net, and building a second net, a sort of mirror
image of the first, after the two-way interchangers, which uses these
extra bits to guide the messages to the right place, allowing the
messages to be put end to end.
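
Here is a minimal sketch of that routing-bit idea, again in Python and
again with names of my own (forward, mirror): each comparator in the
forward net records whether it exchanged its pair, and a mirror image
copy of the stages replays those bits in reverse order to carry
replies back, so the forward net itself never has to freeze. The
stage list below is an ordinary four-input sorting network, not
specifically Batcher's construction.

def forward(stages, items, key):
    """Run items through the comparator stages, recording the swap bits."""
    bits = []
    for stage in stages:                    # each stage: disjoint (i, j) pairs
        stage_bits = []
        for i, j in stage:
            swap = key(items[i]) > key(items[j])
            if swap:
                items[i], items[j] = items[j], items[i]
            stage_bits.append(swap)
        bits.append(stage_bits)
    return items, bits

def mirror(stages, bits, replies):
    """Route replies back along the recorded paths (stages in reverse order)."""
    for stage, stage_bits in zip(reversed(stages), reversed(bits)):
        for (i, j), swapped in zip(stage, stage_bits):
            if swapped:
                replies[i], replies[j] = replies[j], replies[i]
    return replies

# Example: sort the messages by destination on the way out, then use the
# recorded bits to return an acknowledgement to each original sender.
stages = [[(0, 1), (2, 3)], [(0, 2), (1, 3)], [(1, 2)]]
msgs, bits = forward(stages, ["to 2", "to 0", "to 3", "to 1"], key=lambda m: m)
acks = mirror(stages, bits, ["ack from " + m[3] for m in msgs])
# acks[0] is now "ack from 2", matching the "to 2" that processor 0 sent.
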
The multiple access memory construction in the next section
of the paper looks perfect for a bubble memory organization for the
multiprocessor (after all, if the bubbles are going to be shifted
around and around, they might as well do something useful as they
move), allowing simultaneous access by all the processors to the
entire mass storage. This is for later, of course.
It occurred to me that Knuth might have something about all
this in Vol 3 (Sorting and Searching), and, sure enough, section 5.3.4
is called Networks for Sorting, and talks about Batcher's nets and
others. The conclusion there is that slightly cheaper solutions than
his are possible, but construction of these is complicated, and the
result is messy. I'm currently reading some of Van Voorhis' (referred
to by Knuth) papers on nets with fewer elements, but it looks like
bitonic sorters really are the best bet for actual implementation,
mainly because the delay time for all paths through them is the same.
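
For concreteness, here is a sketch (names mine) of how the comparator
stages of Batcher's bitonic sorter can be generated for n a power of
two; the uniform delay shows up as every input passing through exactly
p*(p+1)/2 stages when n = 2**p, so every path through the net is the
same length.

def bitonic_stages(n):
    """Comparator stages of a bitonic sorter for n = 2**p inputs.
    Each comparator (lo, hi) sends the smaller key to lo, the larger to hi."""
    assert n > 0 and n & (n - 1) == 0, "n must be a power of two"
    stages = []
    k = 2
    while k <= n:                 # size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:             # comparator span within the merge
            stage = []
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    if i & k == 0:
                        stage.append((i, partner))    # ascending half
                    else:
                        stage.append((partner, i))    # descending half
            stages.append(stage)
            j //= 2
        k *= 2
    return stages                 # p*(p+1)/2 stages for n = 2**p

def bitonic_sort(items):
    """Apply the stages to a list, as a check that the net really sorts."""
    a = list(items)
    for stage in bitonic_stages(len(a)):
        for lo, hi in stage:
            if a[lo] > a[hi]:
                a[lo], a[hi] = a[hi], a[lo]
    return a

# bitonic_sort([3, 1, 4, 1, 5, 9, 2, 6]) -> [1, 1, 2, 3, 4, 5, 6, 9],
# using 3*4/2 = 6 stages for 8 inputs.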